Ideal amino acid exchange forms for approximating substitution matrices.

Authors

  • Piotr Pokarowski
  • Andrzej Kloczkowski
  • Szymon Nowakowski
  • Maria Pokarowska
  • Robert L Jernigan
  • Andrzej Kolinski
Abstract

We analyzed 29 published substitution matrices (SMs) and, for comparison, five statistical protein contact potentials (CPs). We find that popular, 'classical' SMs obtained mainly from sequence alignments of globular proteins mostly correlate with one another at 0.9 or higher. BLOSUM62 is the central element of this group. A second group includes SMs derived from alignments of remote homologs or transmembrane proteins. These matrices correlate better with classical SMs (0.8) than among themselves (0.7). A third group consists of intermediate links between SMs and CPs: matrices and potentials that exhibit mutual correlations of at least 0.8. Next, we show that SMs can be approximated with a correlation of 0.9 by expressions c_0 + x_i x_j + y_i y_j + z_i z_j, 1 ≤ i, j ≤ 20, where c_0 is a constant and the vectors (x_i), (y_i), (z_i) correlate highly with hydrophobicity, molecular volume and coil preferences of amino acids, respectively. The present paper continues our earlier work (Pokarowski et al., Proteins 2005;59:49-57), where similar approximations were used to derive ideal amino acid interaction forms from CPs. Both approximations allow us to understand general trends in amino acid similarity and can help improve multiple sequence alignment using the fast Fourier transform (MAFFT), fast threading or other methods based on alignments of physicochemical profiles of protein sequences. Using this approximation in sequence alignments instead of a classical SM yields results that differ by less than 5%. The new findings are the intermediate links between SMs and CPs, the new formulas for approximating these matrices, and the highly significant dependence of classical SMs on coil preferences.
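The approximating expression in the abstract (a constant plus three per-residue products) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the helper names `exchange_form` and `pearson_upper`, the four-letter toy alphabet, and all numeric values are hypothetical and are not taken from the paper. The abstract also does not say over which matrix entries the correlation is computed; the upper triangle including the diagonal is assumed here.

```python
from itertools import combinations_with_replacement
from math import sqrt


def exchange_form(c0, x, y, z):
    """Build the symmetric matrix S[i][j] = c0 + x[i]*x[j] + y[i]*y[j] + z[i]*z[j]."""
    n = len(x)
    return [
        [c0 + x[i] * x[j] + y[i] * y[j] + z[i] * z[j] for j in range(n)]
        for i in range(n)
    ]


def pearson_upper(a, b):
    """Pearson correlation of two symmetric matrices over the upper triangle
    (diagonal included). This choice of entries is an assumption."""
    n = len(a)
    pairs = list(combinations_with_replacement(range(n), 2))
    u = [a[i][j] for i, j in pairs]
    v = [b[i][j] for i, j in pairs]
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v))
    var_u = sum((p - mean_u) ** 2 for p in u)
    var_v = sum((q - mean_v) ** 2 for q in v)
    return cov / sqrt(var_u * var_v)


# Hypothetical per-residue vectors for a toy 4-letter alphabet.
# In the paper's setting these would have 20 entries and play the roles of
# hydrophobicity-like (x), volume-like (y) and coil-preference-like (z) scales.
x = [1.0, 0.5, -0.5, -1.0]
y = [0.3, -0.3, 0.3, -0.3]
z = [0.2, 0.1, -0.1, -0.2]
c0 = -0.5

approx = exchange_form(c0, x, y, z)

# Hypothetical "target" matrix: the approximation plus a small symmetric
# perturbation, standing in for a real substitution matrix.
target = [
    [approx[i][j] + 0.05 * ((i + j) % 2) for j in range(4)]
    for i in range(4)
]

r = pearson_upper(approx, target)
print(f"correlation between toy target and approximation: {r:.3f}")
```

Because each entry depends only on per-residue values, a pairwise score can be computed from three numbers per residue rather than looked up in a 20 × 20 table, which is what makes the form convenient for profile-based alignment methods of the kind the abstract mentions.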

Related articles

The construction of amino acid substitution matrices for the comparison of proteins with non-standard compositions

MOTIVATION Amino acid substitution matrices play a central role in protein alignment methods. Standard log-odds matrices, such as those of the PAM and BLOSUM series, are constructed from large sets of protein alignments having implicit background amino acid frequencies. However, these matrices frequently are used to compare proteins with markedly different amino acid compositions, such as trans...

Amino acid substitution matrices for protein conformation identification

Methods for alignment of protein sequences typically measure similarity by using a substitution matrix with scores for all possible exchanges of one amino acid with another. Although widely used, the matrices derived from homologous sequence segments, such as Dayhoff’s PAM matrices and Henikoff’s BLOSUM matrices, are not specific for protein conformation identification. Using a different approach...

Position Dependent and Independent Evolutionary Models Based on Empirical Amino Acid Substitution Matrices

Evolutionary models measure the probability of amino acid substitutions occurring over different evolutionary distances. We examine various evolutionary models based on empirically derived amino acid substitution matrices. The models are constructed using the PAM and BLOSUM amino acid substitution matrices. We rescale these matrices by raising them to powers to model substitution patterns that ...

Amino Acid Substitution Matrices Estimated by Maximum Likelihood

The present work describes protrates, a program that estimates amino acid substitution matrices and among-site substitution rates based on their likelihood for a given tree topology and a dataset of aligned proteins. The issue of producing maximum likelihood (ML) rate matrices over protein data has been addressed under the framework of general-purpose unbiased substitution matrices [1, 9], sinc...

Genome bias influences amino acid choices: analysis of amino acid substitution and re-compilation of substitution matrices exclusive to an AT-biased genome

The genomic era has seen a remarkable increase in the number of genomes being sequenced and annotated. Nonetheless, annotation remains a serious challenge for compositionally biased genomes. For the preliminary annotation, popular nucleotide and protein comparison methods such as BLAST are widely employed. These methods make use of matrices to score alignments such as the amino acid substitutio...

Journal:
  • Proteins

Volume 69, Issue 2

Pages: -

Publication date: 2007